
engine: add support for finding a plan's affected shards#8681

Closed
vmg wants to merge 1 commit into vitessio:main from vmg:vmg/plan-shards

Conversation

@vmg
Collaborator

@vmg vmg commented Aug 25, 2021

Description

I just had a very productive unproductive morning! After discussing the next steps for our buffering work with @harshit-gangal, he suggested that I implement a GetExecShards method on all our plan Primitive types: this would allow us to figure out all the shards that would be affected by executing a given plan, so the new buffering code knows whether a given plan would require buffering because a given shard is currently undergoing a disruption event.

After implementing the whole logic for all our primitive types, I quickly noticed that this is not a feasible step forward: for many of the most common primitives, figuring out the reachable shards for the plan is essentially as expensive as actually executing the plan itself (in fact, for some primitives, we need to actually execute parts of the plan in order to accurately gather the reachable shards for the full execution).
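To make the problem concrete, here is a minimal sketch (the type and method names are illustrative, not Vitess's actual API) of why a shard-discovery method cannot stay cheap: a statically routed primitive can report its shards for free, but a join-like primitive routes its right-hand side using values produced by its left-hand side, so its shard set is unknowable without executing part of the plan.

```go
package main

import "fmt"

// Shard is a stand-in for a shard name such as "-80".
type Shard string

// Route targets shards that can be resolved statically, without execution.
type Route struct {
	Keyspace string
	Shards   []Shard
}

func (r *Route) ExecShards() ([]Shard, error) {
	return r.Shards, nil // cheap: no execution needed
}

// Join routes its right-hand side using values produced by the left-hand
// side, so its full shard set is unknowable until the left side has run.
type Join struct {
	Left, Right *Route
}

func (j *Join) ExecShards() ([]Shard, error) {
	// Computing the right-hand shards would require executing j.Left first,
	// which is roughly as expensive as just running the plan.
	return nil, fmt.Errorf("join: shards depend on left-hand results")
}

func main() {
	r := &Route{Keyspace: "user", Shards: []Shard{"-80", "80-"}}
	shards, _ := r.ExecShards()
	fmt.Println(shards)

	j := &Join{Left: r, Right: r}
	_, err := j.ExecShards()
	fmt.Println(err)
}
```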

I'm opening this PR nonetheless for further discussion, and as a future reference. Although the functionality is complete and working, I'm not sure there's a compelling reason to have it merged.

cc @harshit-gangal @deepthi

Related Issue(s)

Checklist

  • Should this PR be backported?
  • Tests were added or are not required
  • Documentation was added or is not required

Deployment Notes

Signed-off-by: Vicent Marti <vmg@strn.cat>
@harshit-gangal
Member

This would not have worked, and not just because of the computation cost: there are a lot of interdependent primitives that need the results from their input primitives to determine which shard the current primitive should send the query to.

Also, there are cases like auto_increment sequences: the plan can produce a value that says the query will go to a particular shard, but that value is lost when the query is actually executed, because execution fetches the next sequence value.
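The sequence point above can be illustrated with a toy sketch (not Vitess code): a sequence hands out a fresh value on every call, so a re-executed insert cannot reproduce the id, and hence the shard, it was originally planned with.

```go
package main

import "fmt"

// sequence is a toy stand-in for an auto_increment sequence backend.
type sequence struct{ next int64 }

// NextValue returns a fresh value on every call, like a real sequence.
func (s *sequence) NextValue() int64 {
	v := s.next
	s.next++
	return v
}

func main() {
	seq := &sequence{next: 1}
	fmt.Println(seq.NextValue()) // the first execution is planned with id 1 ...
	fmt.Println(seq.NextValue()) // ... but a re-execution fetches 2 instead
}
```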

The other approach we talked about was that buffering still happens at the tabletgateway level, and the buffering event provides the information on whether to execute the query once the event is complete (in the case of reparenting) or to re-execute the plan (in the case of resharding).

This also has one issue when it touches lookup vindexes for an insert or update: it can lead to wrong entries in those tables.

Example

user has an auto_increment id column and name_id_map is a lookup vindex.

-> insert into user(name) values ('a')
lookup entry -> insert into name_id_map('a', 1, <ks_id>)

A transaction is now open on a lookup keyspace shard.

Buffering happens on the actual insert statement -> insert into user(id, name) values (1, 'a')

On re-execution of the plan:
lookup entry -> insert into name_id_map('a', 2, <ks_id>)
The transaction opens on the same or a different shard in the lookup keyspace.

insert into user(id, name) values (2, 'a')

Now a select query will fail if name_id_map is a unique lookup vindex, as it tries to map the input to multiple shards.

@harshit-gangal
Member

The best way I see is that, instead of re-executing the complete plan, we put retries inside the engine primitives, so that they re-execute only the specific failed queries that the tabletgateway asks to retry with a new shard destination.
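A minimal sketch of that retry idea (all names here are hypothetical, not the actual Vitess API): when a shard query fails because of a disruption event, the primitive retries just that one query against the destination suggested by the error, rather than replanning or re-executing everything above it.

```go
package main

import (
	"errors"
	"fmt"
)

// shardError is a toy error carrying the new destination suggested by the
// (hypothetical) tabletgateway after a resharding event.
type shardError struct{ newShard string }

func (e *shardError) Error() string { return "shard moved to " + e.newShard }

// execOnShard stands in for sending one query to one shard; it fails on the
// first attempt against "-80" to simulate a resharding redirect.
func execOnShard(shard string, attempt int) (string, error) {
	if attempt == 0 && shard == "-80" {
		return "", &shardError{newShard: "-40"}
	}
	return "ok@" + shard, nil
}

// execWithRetry retries only the single failed shard query with the new
// destination, instead of re-executing the complete plan.
func execWithRetry(shard string) (string, error) {
	for attempt := 0; attempt < 2; attempt++ {
		res, err := execOnShard(shard, attempt)
		if err == nil {
			return res, nil
		}
		var se *shardError
		if errors.As(err, &se) {
			shard = se.newShard // retry just this query on the new shard
			continue
		}
		return "", err
	}
	return "", errors.New("retries exhausted")
}

func main() {
	res, _ := execWithRetry("-80")
	fmt.Println(res)
}
```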

